Similar resources
Word n-gram probability estimation from a Japanese raw corpus
Statistical language modeling plays an important role in a state-of-the-art speech recognizer. The most widely used language model (LM) is the word n-gram model, which is based on the frequency of words and word sequences in a corpus. In various Asian languages, however, words are not delimited by whitespace, so we need to annotate sentences with word boundary information to prepare a statistically reliable...
Story Cloze Task: UW NLP System
This paper describes University of Washington NLP’s submission for the Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem 2017) shared task—the Story Cloze Task. Our system is a linear classifier with a variety of features, including both the scores of a neural language model and style features. We report 75.2% accuracy on the task. A further discussion of our results c...
Hybrid N-gram Probability Estimation in Morphologically Rich Languages
N-gram language modeling is essential in natural language processing and speech processing. In morphologically rich languages such as Korean, a word usually consists of at least one lemma (content morpheme) and functional morphemes which represent various grammatical functions. Most word forms in Korean, however, have problems of sparse data and zero probability, because of quite complex morpheme combina...
Predicting Cloze Task Quality for Vocabulary Training
Computer generation of cloze tasks still falls short of full automation; most current systems are used by teachers as authoring aids. Improved methods to estimate cloze quality are needed for full automation. We investigated lexical reading difficulty as a novel automatic estimator of cloze quality, to which cooccurrence frequency of words was compared as an alternate estimator. Rather than rel...
Task adaptation using MAP estimation in N-gram language modeling
This paper describes a method of task adaptation in N-gram language modeling, for accurately estimating the N-gram statistics from the small amount of data of the target task. Assuming a task-independent N-gram to be a-priori knowledge, the N-gram is adapted to a target task by MAP (maximum a-posteriori probability) estimation. Experimental results showed that the perplexities of the task adapt...
Journal
Journal title: Bridging the Methodological Divide
Year: 2014
ISSN: 1871-1340,1871-1375
DOI: 10.1075/ml.9.3.04sha